Performance Optimisations of the NPB FT Kernel by Special-Purpose Unroller

Abstract

The fast Fourier transform (FFT) is the cornerstone of many supercomputer applications and therefore needs careful performance tuning. Most often, however, the real performance of FFT implementations is far below acceptable figures. In this paper, we explore several strategies for optimising the performance of the FFT computation, such as enhancing instruction-level parallelism, loop merging, and reducing memory loads and stores, by using a special-purpose automatic loop unroller. Our approach is based on the principle of complete unrolling, which we apply to modify the FT kernel of the NAS Parallel Benchmarks (NPB). In experiments on two different IBM SP2 platforms, our automatically generated unrolled FFT subroutine is shown to improve performance by between 40% and 53% in comparison with the original code. Further, the entire 3-D FFT mega-step of the benchmark executes faster than when calls to a similar FFT subroutine from the vendor-optimised PESSL numerical library are used. Preliminary results suggest that the completely unrolled code also outperforms FFTW, another high-performance FFT package. Finally, our approach for automatic generation of moderately optimised but specialised codes requires only a modest amount of programming effort.
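To make the idea of complete unrolling concrete, the following is a minimal sketch and not the unroller described in the paper, which targets the NPB FT kernel. It assumes a radix-2 decimation-in-time FFT with a power-of-two length fixed at generation time, and it emits Python source rather than the benchmark's own language. The names `generate_unrolled_fft`, `compile_unrolled_fft`, and `dft_reference` are hypothetical. Because the length is known in advance, every loop disappears and each twiddle factor becomes a literal constant in the generated straight-line code.

```python
import cmath


def generate_unrolled_fft(n):
    """Return Python source for a loop-free n-point FFT (hypothetical sketch).

    Assumes radix-2 decimation in time; n must be a power of two.
    """
    assert n >= 1 and n & (n - 1) == 0, "n must be a power of two"
    lines = ["def fft_unrolled(x):"]
    counter = [0]

    def fresh():
        # Produce a new temporary variable name for the generated code.
        counter[0] += 1
        return f"t{counter[0]}"

    def emit(inputs):
        # 'inputs' holds the names of the values forming this sub-transform.
        m = len(inputs)
        if m == 1:
            return inputs
        even = emit(inputs[0::2])
        odd = emit(inputs[1::2])
        out = [None] * m
        for k in range(m // 2):
            # Twiddle factor is computed now and embedded as a constant.
            w = cmath.exp(-2j * cmath.pi * k / m)
            tw = fresh()
            if k == 0:
                lines.append(f"    {tw} = {odd[k]}")  # multiplication by 1 elided
            else:
                lines.append(f"    {tw} = {w!r} * {odd[k]}")
            a, b = fresh(), fresh()
            lines.append(f"    {a} = {even[k]} + {tw}")  # butterfly, upper output
            lines.append(f"    {b} = {even[k]} - {tw}")  # butterfly, lower output
            out[k], out[k + m // 2] = a, b
        return out

    outputs = emit([f"x[{i}]" for i in range(n)])
    lines.append("    return [" + ", ".join(outputs) + "]")
    return "\n".join(lines)


def compile_unrolled_fft(n):
    """Compile the generated source and return the specialised function."""
    namespace = {}
    exec(generate_unrolled_fft(n), namespace)
    return namespace["fft_unrolled"]


def dft_reference(x):
    """Direct O(n^2) DFT, used only to check the generated code."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


if __name__ == "__main__":
    print(generate_unrolled_fft(4))
```

Running the script prints the generated 4-point routine, which contains only assignments and a final `return`. In a compiled language, straight-line code of this kind is what gives a compiler room to schedule independent butterflies together and keep intermediate values in registers, which is the kind of benefit the abstract attributes to unrolling.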

Similar resources

Synthesis and formulation of the NHTPB-NPB binder-plasticizer system and investigation of its performance properties in modified PBXN-109

In this work, nitration of low molecular weight polybutadiene (PB) by a convenient and inexpensive procedure has been investigated. The product, the nitropolybutadiene (NPB) energetic plasticizer, was characterized by FT-IR, 1H-NMR, GPC, TGA, DSC etc. Then NPB energetic polymer plasticizer and nitro-hydroxyl terminated polybutadiene (NHTPB) binder have been replaced with dioctyl adipate (DOA) inert...

Performance assessment of parallel techniques

The goal of this work is to evaluate and compare the computational performance of the most common parallel libraries, such as Message Passing Interface (MPI), High Performance Fortran (HPF), OpenMP and DVM, for further implementations. Evaluation is based on the NAS Parallel Benchmarks (NPB) suite, which includes the simulated applications BT, SP, LU and the kernel benchmarks FT, CG and MG. A brief introductio...

User-Level VSM Optimization and its Application

This paper describes user-level optimisations for virtual shared memory (VSM) systems and demonstrates performance improvements for three scientific kernel codes written in Fortran-S and running on a 30-node prototype distributed memory architecture. These optimisations can be applied to all consistency models and directory schemes, whether in hardware or software, which employ an invalidation b...

Neural Network-Based Learning Kernel for Automatic Segmentation of Multiple Sclerosis Lesions on Magnetic Resonance Images

Background: Multiple Sclerosis (MS) is a degenerative disease of the central nervous system. MS patients have some dead tissues in their brains called MS lesions. MRI is an imaging technique sensitive to soft tissues such as the brain, and it shows MS lesions as hyper-intense or hypo-intense signals. Since manual segmentation of these lesions is a laborious and time-consuming task, automatic segmentation ...

Publication date: 1999